Overview

Dataset statistics

Number of variables: 26
Number of observations: 128655
Missing cells: 23634
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 25.5 MiB
Average record size in memory: 208.0 B
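
The two derived figures above can be checked against the raw counts. The sketch below assumes (the report does not state its definitions) that the missing-cell percentage is missing cells divided by variables times observations, and that the average record size is total memory divided by the number of observations.

```python
# Figures copied from the dataset statistics above.
n_variables = 26
n_observations = 128655
missing_cells = 23634
total_memory_mib = 25.5

# Assumed definition: share of missing cells among all cells in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumed definition: total memory in bytes divided by the number of rows.
avg_record_bytes = total_memory_mib * 1024 ** 2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the reported 25.5 MiB is itself rounded, the recomputed record size agrees with the reported 208.0 B only approximately.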

Variable types

NUM: 14
CAT: 9
DATE: 3

Reproduction

Analysis started: 2021-01-30 07:11:53.156256
Analysis finished: 2021-01-30 07:13:08.903154
Duration: 1 minute and 15.75 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Area has a high cardinality: 92 distinct values [High cardinality]
City has a high cardinality: 488 distinct values [High cardinality]
DisbursalAmount is highly correlated with AmountFinance [High correlation]
AmountFinance is highly correlated with DisbursalAmount [High correlation]
Area has 11653 (9.1%) missing values [Missing]
City has 11256 (8.7%) missing values [Missing]
MonthlyIncome is highly skewed (γ1 = 357.3273301) [Skewed]
ID has unique values [Unique]
AssetID has unique values [Unique]
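
The skewness warning for MonthlyIncome is what a single extreme value produces. A minimal sketch of the moment-based skewness coefficient follows; the incomes are hypothetical, not rows from the dataset, and the estimator used by the report may differ (for example by a small-sample correction).

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, with no small-sample correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical monthly incomes (illustration only).
incomes = [25000.0, 30000.0, 35000.0, 40000.0, 50000.0] * 20

# Append one implausibly large value, similar in scale to the reported maximum.
with_outlier = incomes + [600_000_000.0]

print(sample_skewness(incomes))       # modest skewness
print(sample_skewness(with_outlier))  # dominated by the single outlier
```

With one dominant outlier among n values the coefficient approaches (n - 2) / sqrt(n - 1), so it grows with sample size. That is consistent with a very large γ1 on more than 128,000 rows.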

Variables

ID
Real number (ℝ≥0)

UNIQUE

Distinct count: 128655
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70965.32655551669
Minimum: 1
Maximum: 143395
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6932.7
Q1: 34408.5
median: 70988
Q3: 106549.5
95-th percentile: 136475.3
Maximum: 143395
Range: 143394
Interquartile range (IQR): 72141

Descriptive statistics

Standard deviation: 41762.77928
Coefficient of variation (CV): 0.5884955557
Kurtosis: -1.220605837
Mean: 70965.32656
Median Absolute Deviation (MAD): 36079
Skewness: 0.03192695249
Sum: 9130044088
Variance: 1744129734
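
Several of the statistics above are simple functions of the others. The short check below recomputes them from the reported ID figures, using the usual textbook definitions (CV as standard deviation over mean, IQR as Q3 minus Q1), which are assumed here to match the report's.

```python
# Figures copied from the ID statistics above.
mean = 70965.32656
std = 41762.77928
q1, q3 = 34408.5, 106549.5
minimum, maximum = 1, 143395

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(cv, iqr, value_range, variance)
```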
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
2047 | 1 | < 0.1%
140654 | 1 | < 0.1%
62772 | 1 | < 0.1%
64821 | 1 | < 0.1%
58678 | 1 | < 0.1%
60727 | 1 | < 0.1%
38200 | 1 | < 0.1%
40249 | 1 | < 0.1%
34106 | 1 | < 0.1%
48445 | 1 | < 0.1%
Other values (128645) | 128645 | > 99.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
143395 | 1 | < 0.1%
143394 | 1 | < 0.1%
143393 | 1 | < 0.1%
143391 | 1 | < 0.1%
143390 | 1 | < 0.1%

Frequency
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
Half Yearly | 76248 | 59.3%
Monthly | 31150 | 24.2%
Quatrly | 20795 | 16.2%
BI-Monthly | 462 | 0.4%

Length

Max length: 11
Median length: 11
Mean length: 9.381392095
Min length: 7

InstlmentMode
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
Arrear | 122349 | 95.1%
Advance | 6306 | 4.9%

Length

Max length: 7
Median length: 6
Mean length: 6.049014807
Min length: 6

LoanStatus
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
Closed | 94457 | 73.4%
Active | 34198 | 26.6%

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

PaymentMode
Categorical

Distinct count: 11
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
Direct Debit | 31766 | 24.7%
ECS | 31390 | 24.4%
PDC | 26617 | 20.7%
Billed | 26486 | 20.6%
PDC_E | 9937 | 7.7%
Auto Debit | 843 | 0.7%
SI Reject | 744 | 0.6%
Cheque | 442 | 0.3%
ECS Reject | 417 | 0.3%
Escrow | 7 | < 0.1%

Length

Max length: 12
Median length: 5
Mean length: 6.108305157
Min length: 3

BranchID
Real number (ℝ≥0)

Distinct count: 189
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 166.28967393416502
Minimum: 1
Maximum: 424
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10
Q1: 50
median: 152
Q3: 274
95-th percentile: 340
Maximum: 424
Range: 423
Interquartile range (IQR): 224

Descriptive statistics

Standard deviation: 115.8440477
Coefficient of variation (CV): 0.6966400555
Kurtosis: -1.342787683
Mean: 166.2896739
Median Absolute Deviation (MAD): 111
Skewness: 0.1960464794
Sum: 21393998
Variance: 13419.84338

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
2 | 4447 | 3.5%
199 | 4194 | 3.3%
50 | 2842 | 2.2%
166 | 2381 | 1.9%
202 | 2208 | 1.7%
204 | 2126 | 1.7%
263 | 2104 | 1.6%
9 | 2012 | 1.6%
47 | 1952 | 1.5%
133 | 1939 | 1.5%
Other values (179) | 102423 | 79.6%

Minimum 5 values

Value | Count | Frequency (%)
1 | 621 | 0.5%
3 | 824 | 0.6%
5 | 1135 | 0.9%
6 | 666 | 0.5%
8 | 890 | 0.7%

Maximum 5 values

Value | Count | Frequency (%)
424 | 19 | < 0.1%
423 | 122 | 0.1%
421 | 8 | < 0.1%
418 | 28 | < 0.1%
416 | 9 | < 0.1%

Area
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 92
Unique (%): 0.1%
Missing: 11653
Missing (%): 9.1%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
LUCKNOW | 9337 | 7.3%
SIRSA | 6575 | 5.1%
NELLORE | 5859 | 4.6%
KANPUR | 4573 | 3.6%
INDORE | 4024 | 3.1%
MIRYALGUDA | 3978 | 3.1%
JAIPUR | 3724 | 2.9%
SINDHANUR | 3719 | 2.9%
AHMEDABAD AMBAVADI | 3654 | 2.8%
JABALPUR | 3534 | 2.7%
Other values (82) | 68025 | 52.9%
(Missing) | 11653 | 9.1%

Length

Max length: 28
Median length: 7
Mean length: 7.974326688
Min length: 3

Tenure
Real number (ℝ≥0)

Distinct count: 141
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.46477789436866
Minimum: 5
Maximum: 501
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 5
5-th percentile: 22
Q1: 36
median: 36
Q3: 48
95-th percentile: 60
Maximum: 501
Range: 496
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 23.53397173
Coefficient of variation (CV): 0.5541998074
Kurtosis: 53.56308226
Mean: 42.46477789
Median Absolute Deviation (MAD): 12
Skewness: 6.02575669
Sum: 5463306
Variance: 553.8478255

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
36 | 36977 | 28.7%
48 | 25672 | 20.0%
24 | 14327 | 11.1%
60 | 12209 | 9.5%
42 | 4864 | 3.8%
12 | 3482 | 2.7%
54 | 2075 | 1.6%
33 | 2022 | 1.6%
30 | 2008 | 1.6%
35 | 1817 | 1.4%
Other values (131) | 23202 | 18.0%

Minimum 5 values

Value | Count | Frequency (%)
5 | 2 | < 0.1%
6 | 6 | < 0.1%
7 | 4 | < 0.1%
8 | 5 | < 0.1%
9 | 16 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
501 | 1 | < 0.1%
354 | 1 | < 0.1%
342 | 3 | < 0.1%
336 | 5 | < 0.1%
330 | 21 | < 0.1%

AssetCost
Real number (ℝ≥0)

Distinct count: 7835
Unique (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 555024.6847926625
Minimum: 200000
Maximum: 2250000
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 200000
5-th percentile: 388000
Q1: 500000
median: 550735
Q3: 611000
95-th percentile: 700000
Maximum: 2250000
Range: 2050000
Interquartile range (IQR): 111000

Descriptive statistics

Standard deviation: 108303.551
Coefficient of variation (CV): 0.1951328543
Kurtosis: 21.54192698
Mean: 555024.6848
Median Absolute Deviation (MAD): 55735
Skewness: 1.912221804
Sum: 7.140670082e+10
Variance: 1.172965915e+10

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
550000 | 5648 | 4.4%
600000 | 4594 | 3.6%
500000 | 4413 | 3.4%
560000 | 3992 | 3.1%
580000 | 3630 | 2.8%
530000 | 3329 | 2.6%
540000 | 3209 | 2.5%
650000 | 3146 | 2.4%
520000 | 2985 | 2.3%
570000 | 2747 | 2.1%
Other values (7825) | 90962 | 70.7%

Minimum 5 values

Value | Count | Frequency (%)
200000 | 1 | < 0.1%
210000 | 2 | < 0.1%
215000 | 1 | < 0.1%
218215 | 1 | < 0.1%
219000 | 3 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
2250000 | 1 | < 0.1%
2221888 | 1 | < 0.1%
2199525 | 1 | < 0.1%
2180000 | 1 | < 0.1%
2178750 | 1 | < 0.1%

AmountFinance
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 20439
Unique (%): 15.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 348309.63386576495
Minimum: 50000.0
Maximum: 1308351.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 50000
5-th percentile: 195000
Q1: 290000
median: 350000
Q3: 410000
95-th percentile: 503644.154
Maximum: 1308351
Range: 1258351
Interquartile range (IQR): 120000

Descriptive statistics

Standard deviation: 105545.3424
Coefficient of variation (CV): 0.3030215996
Kurtosis: 2.427430198
Mean: 348309.6339
Median Absolute Deviation (MAD): 60000
Skewness: 0.4365686939
Sum: 4.481177594e+10
Variance: 1.11398193e+10

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
300000 | 14063 | 10.9%
400000 | 9016 | 7.0%
350000 | 7651 | 5.9%
200000 | 6896 | 5.4%
250000 | 5149 | 4.0%
450000 | 3094 | 2.4%
500000 | 2489 | 1.9%
150000 | 1646 | 1.3%
380000 | 1469 | 1.1%
370000 | 1414 | 1.1%
Other values (20429) | 75768 | 58.9%

Minimum 5 values

Value | Count | Frequency (%)
50000 | 1 | < 0.1%
55000 | 1 | < 0.1%
70000 | 1 | < 0.1%
75000 | 1 | < 0.1%
80000 | 3 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1308351 | 1 | < 0.1%
1302822 | 1 | < 0.1%
1300000 | 2 | < 0.1%
1275000 | 1 | < 0.1%
1257239 | 1 | < 0.1%

DisbursalAmount
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 19412
Unique (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 347930.56846465357
Minimum: 2894.0
Maximum: 1308351.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 2894
5-th percentile: 195000
Q1: 290000
median: 350000
Q3: 410000
95-th percentile: 503035.3
Maximum: 1308351
Range: 1305457
Interquartile range (IQR): 120000

Descriptive statistics

Standard deviation: 105319.8355
Coefficient of variation (CV): 0.3027035996
Kurtosis: 2.454799098
Mean: 347930.5685
Median Absolute Deviation (MAD): 60000
Skewness: 0.4323927592
Sum: 4.476300729e+10
Variance: 1.109226774e+10

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
300000 | 14164 | 11.0%
400000 | 9147 | 7.1%
350000 | 7706 | 6.0%
200000 | 6943 | 5.4%
250000 | 5175 | 4.0%
450000 | 3149 | 2.4%
500000 | 2550 | 2.0%
150000 | 1657 | 1.3%
380000 | 1483 | 1.2%
370000 | 1424 | 1.1%
Other values (19402) | 75257 | 58.5%

Minimum 5 values

Value | Count | Frequency (%)
2894 | 1 | < 0.1%
3000 | 1 | < 0.1%
4063 | 1 | < 0.1%
5574 | 1 | < 0.1%
9503 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1308351 | 1 | < 0.1%
1302822 | 1 | < 0.1%
1300000 | 2 | < 0.1%
1275000 | 1 | < 0.1%
1257239 | 1 | < 0.1%

EMI
Real number (ℝ≥0)

Distinct count: 24323
Unique (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55072.75850600443
Minimum: 0.0
Maximum: 460000.0
Zeros: 1
Zeros (%): < 0.1%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10484.7
Q1: 32500
median: 59700
Q3: 73800
95-th percentile: 97950
Maximum: 460000
Range: 460000
Interquartile range (IQR): 41300

Descriptive statistics

Standard deviation: 28910.11174
Coefficient of variation (CV): 0.5249439564
Kurtosis: 3.740022756
Mean: 55072.75851
Median Absolute Deviation (MAD): 19743
Skewness: 0.6874947937
Sum: 7085385746
Variance: 835794560.8

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
20000 | 1911 | 1.5%
30000 | 1864 | 1.4%
25000 | 1399 | 1.1%
50000 | 1235 | 1.0%
15000 | 988 | 0.8%
60000 | 956 | 0.7%
40000 | 937 | 0.7%
65000 | 927 | 0.7%
10000 | 919 | 0.7%
75000 | 894 | 0.7%
Other values (24313) | 116625 | 90.6%

Minimum 5 values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
100 | 1 | < 0.1%
200 | 1 | < 0.1%
300 | 1 | < 0.1%
370 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
460000 | 1 | < 0.1%
450000 | 1 | < 0.1%
430000 | 1 | < 0.1%
390000 | 1 | < 0.1%
389700 | 1 | < 0.1%
 

DisbursalDate
Date

Distinct count: 2837
Unique (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB
Minimum: 2010-02-01 00:00:00
Maximum: 2019-11-23 00:00:00

MaturityDAte
Date

Distinct count: 980
Unique (%): 0.8%
Missing: 1
Missing (%): < 0.1%
Memory size: 1005.1 KiB
Minimum: 2010-11-15 00:00:00
Maximum: 2059-08-10 00:00:00

AuthDate
Date

Distinct count: 2711
Unique (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB
Minimum: 2010-02-03 00:00:00
Maximum: 2019-11-23 00:00:00

AssetID
Real number (ℝ≥0)

UNIQUE

Distinct count: 128655
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16397728.99015973
Minimum: 422271
Maximum: 37066666
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 422271
5-th percentile: 910238.8
Q1: 9438948.5
median: 15133927
Q3: 25333957.5
95-th percentile: 30298259
Maximum: 37066666
Range: 36644395
Interquartile range (IQR): 15895009

Descriptive statistics

Standard deviation: 9539540.249
Coefficient of variation (CV): 0.5817598434
Kurtosis: -1.130064666
Mean: 16397728.99
Median Absolute Deviation (MAD): 9005918
Skewness: -0.09448442021
Sum: 2.109649823e+12
Variance: 9.100282816e+13

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
14425238 | 1 | < 0.1%
26856210 | 1 | < 0.1%
432888 | 1 | < 0.1%
25671318 | 1 | < 0.1%
14094476 | 1 | < 0.1%
23157504 | 1 | < 0.1%
5853953 | 1 | < 0.1%
24462083 | 1 | < 0.1%
8485636 | 1 | < 0.1%
27095813 | 1 | < 0.1%
Other values (128645) | 128645 | > 99.9%

Minimum 5 values

Value | Count | Frequency (%)
422271 | 1 | < 0.1%
422403 | 1 | < 0.1%
422562 | 1 | < 0.1%
423130 | 1 | < 0.1%
423152 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
37066666 | 1 | < 0.1%
36977268 | 1 | < 0.1%
36944610 | 1 | < 0.1%
36820837 | 1 | < 0.1%
36762642 | 1 | < 0.1%

ManufacturerID
Real number (ℝ≥0)

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1171.3272706074385
Minimum: 1019
Maximum: 3473
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 1019
5-th percentile: 1019
Q1: 1046
median: 1062
Q3: 1186
95-th percentile: 1568
Maximum: 3473
Range: 2454
Interquartile range (IQR): 140

Descriptive statistics

Standard deviation: 257.7902082
Coefficient of variation (CV): 0.2200838439
Kurtosis: 37.64662117
Mean: 1171.327271
Median Absolute Deviation (MAD): 43
Skewness: 5.071646502
Sum: 150697110
Variance: 66455.79143

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1046 | 25304 | 19.7%
1186 | 23797 | 18.5%
1062 | 18728 | 14.6%
1568 | 16320 | 12.7%
1060 | 15802 | 12.3%
1019 | 15386 | 12.0%
1187 | 12160 | 9.5%
3473 | 745 | 0.6%
2608 | 412 | 0.3%
2733 | 1 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
1019 | 15386 | 12.0%
1046 | 25304 | 19.7%
1060 | 15802 | 12.3%
1062 | 18728 | 14.6%
1186 | 23797 | 18.5%

Maximum 5 values

Value | Count | Frequency (%)
3473 | 745 | 0.6%
2733 | 1 | < 0.1%
2608 | 412 | 0.3%
1568 | 16320 | 12.7%
1187 | 12160 | 9.5%

SupplierID
Real number (ℝ≥0)

Distinct count: 4539
Unique (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56551.226209630404
Minimum: 5879
Maximum: 145518
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 5879
5-th percentile: 21701
Q1: 24093
median: 39146
Q3: 87034
95-th percentile: 129070
Maximum: 145518
Range: 139639
Interquartile range (IQR): 62941

Descriptive statistics

Standard deviation: 36399.35528
Coefficient of variation (CV): 0.6436528032
Kurtosis: -0.7636344882
Mean: 56551.22621
Median Absolute Deviation (MAD): 16195
Skewness: 0.7704984845
Sum: 7275598008
Variance: 1324913065

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
42707 | 1465 | 1.1%
29012 | 1301 | 1.0%
38832 | 1039 | 0.8%
89813 | 1004 | 0.8%
27328 | 735 | 0.6%
23968 | 735 | 0.6%
24390 | 634 | 0.5%
23388 | 598 | 0.5%
32708 | 584 | 0.5%
27361 | 526 | 0.4%
Other values (4529) | 120034 | 93.3%

Minimum 5 values

Value | Count | Frequency (%)
5879 | 46 | < 0.1%
20002 | 97 | 0.1%
20055 | 34 | < 0.1%
20060 | 7 | < 0.1%
20105 | 432 | 0.3%

Maximum 5 values

Value | Count | Frequency (%)
145518 | 1 | < 0.1%
145194 | 1 | < 0.1%
144867 | 2 | < 0.1%
144858 | 1 | < 0.1%
144787 | 1 | < 0.1%

LTV
Real number (ℝ)

Distinct count: 7988
Unique (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59.12622571994871
Minimum: -1.38
Maximum: 100.0
Zeros: 1
Zeros (%): < 0.1%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: -1.38
5-th percentile: 30.747
Q1: 49.14
median: 60.78
Q3: 70.59
95-th percentile: 81.82
Maximum: 100
Range: 101.38
Interquartile range (IQR): 21.45

Descriptive statistics

Standard deviation: 15.53903766
Coefficient of variation (CV): 0.2628112563
Kurtosis: -0.1455309886
Mean: 59.12622572
Median Absolute Deviation (MAD): 10.65
Skewness: -0.4553036387
Sum: 7606884.57
Variance: 241.4616914

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
66.67 | 972 | 0.8%
80 | 707 | 0.5%
62.5 | 604 | 0.5%
71.43 | 592 | 0.5%
50 | 576 | 0.4%
75 | 571 | 0.4%
60 | 551 | 0.4%
70 | 546 | 0.4%
55.56 | 494 | 0.4%
72.73 | 427 | 0.3%
Other values (7978) | 122615 | 95.3%

Minimum 5 values

Value | Count | Frequency (%)
-1.38 | 1 | < 0.1%
0 | 1 | < 0.1%
1.45 | 1 | < 0.1%
1.92 | 1 | < 0.1%
5.22 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
100 | 1 | < 0.1%
99.46 | 1 | < 0.1%
98.7 | 1 | < 0.1%
98.63 | 1 | < 0.1%
98.52 | 1 | < 0.1%

SEX
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 59
Missing (%): < 0.1%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
M | 122144 | 94.9%
F | 6452 | 5.0%
(Missing) | 59 | < 0.1%

Length

Max length: 3
Median length: 1
Mean length: 1.000917182
Min length: 1

AGE
Real number (ℝ≥0)

Distinct count: 73
Unique (%): 0.1%
Missing: 59
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.66480294876979
Minimum: 18.0
Maximum: 90.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 18
5-th percentile: 23
Q1: 31
median: 40
Q3: 49
95-th percentile: 61
Maximum: 90
Range: 72
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 11.71284679
Coefficient of variation (CV): 0.2880340229
Kurtosis: -0.4793653823
Mean: 40.66480295
Median Absolute Deviation (MAD): 9
Skewness: 0.3491376672
Sum: 5229331
Variance: 137.1907798

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
35 | 3926 | 3.1%
40 | 3918 | 3.0%
33 | 3908 | 3.0%
37 | 3889 | 3.0%
34 | 3889 | 3.0%
41 | 3855 | 3.0%
36 | 3848 | 3.0%
38 | 3834 | 3.0%
39 | 3739 | 2.9%
31 | 3695 | 2.9%
Other values (63) | 90095 | 70.0%

Minimum 5 values

Value | Count | Frequency (%)
18 | 228 | 0.2%
19 | 562 | 0.4%
20 | 941 | 0.7%
21 | 1419 | 1.1%
22 | 1756 | 1.4%

Maximum 5 values

Value | Count | Frequency (%)
90 | 1 | < 0.1%
89 | 1 | < 0.1%
88 | 1 | < 0.1%
87 | 1 | < 0.1%
86 | 2 | < 0.1%

MonthlyIncome
Real number (ℝ≥0)

SKEWED

Distinct count: 11568
Unique (%): 9.0%
Missing: 234
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 50323.60434087884
Minimum: 0.0
Maximum: 617477500.0
Zeros: 27
Zeros (%): < 0.1%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 13250
Q1: 25000
median: 35833.33
Q3: 50000
95-th percentile: 92923.33
Maximum: 617477500
Range: 617477500
Interquartile range (IQR): 25000

Descriptive statistics

Standard deviation: 1724607.365
Coefficient of variation (CV): 34.27034664
Kurtosis: 127926.2107
Mean: 50323.60434
Median Absolute Deviation (MAD): 11666.66
Skewness: 357.3273301
Sum: 6462607593
Variance: 2.974270562e+12

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
41666.67 | 9609 | 7.5%
25000 | 8759 | 6.8%
33333.33 | 8592 | 6.7%
50000 | 5441 | 4.2%
29166.67 | 4406 | 3.4%
16666.67 | 4151 | 3.2%
20833.33 | 3350 | 2.6%
37500 | 3344 | 2.6%
58333.33 | 2857 | 2.2%
66666.67 | 2061 | 1.6%
Other values (11558) | 75851 | 59.0%

Minimum 5 values

Value | Count | Frequency (%)
0 | 27 | < 0.1%
0.08 | 154 | 0.1%
0.17 | 50 | < 0.1%
0.25 | 8 | < 0.1%
0.33 | 6 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
617477500 | 1 | < 0.1%
11666666.67 | 1 | < 0.1%
7000000 | 1 | < 0.1%
6462000 | 1 | < 0.1%
5250000 | 1 | < 0.1%

City
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 488
Unique (%): 0.4%
Missing: 11256
Missing (%): 8.7%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
NALGONDA | 2231 | 1.7%
HISAR | 1790 | 1.4%
GUNTUR | 1756 | 1.4%
NELLORE | 1634 | 1.3%
JODHPUR | 1632 | 1.3%
BHIWANI | 1535 | 1.2%
MANDSAUR | 1507 | 1.2%
KARIMNAGAR | 1481 | 1.2%
RAICHUR | 1423 | 1.1%
KHAMMAM | 1335 | 1.0%
Other values (478) | 101075 | 78.6%
(Missing) | 11256 | 8.7%

Length

Max length: 26
Median length: 7
Mean length: 7.203264545
Min length: 1

State
Categorical

Distinct count: 22
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
MADHYA PRADESH | 19122 | 14.9%
ANDHRA PRADESH | 18629 | 14.5%
UTTAR PRADESH | 14449 | 11.2%
KARNATAKA | 11569 | 9.0%
GUJARAT | 10088 | 7.8%
RAJASTHAN | 9668 | 7.5%
MAHARASHTRA | 9083 | 7.1%
HARYANA | 9060 | 7.0%
PUNJAB | 6420 | 5.0%
WEST BENGAL | 5557 | 4.3%
Other values (12) | 15010 | 11.7%

Length

Max length: 22
Median length: 11
Mean length: 10.59959582
Min length: 5

ZiPCODE
Real number (ℝ≥0)

Distinct count: 9123
Unique (%): 7.1%
Missing: 372
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 427931.09972482716
Minimum: 110000.0
Maximum: 855456.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1005.1 KiB

Quantile statistics

Minimum: 110000
5-th percentile: 127021
Q1: 304804
median: 458553
Q3: 521131
95-th percentile: 752100.9
Maximum: 855456
Range: 745456
Interquartile range (IQR): 216327

Descriptive statistics

Standard deviation: 175704.364
Coefficient of variation (CV): 0.4105903127
Kurtosis: -0.2735939009
Mean: 427931.0997
Median Absolute Deviation (MAD): 97393
Skewness: 0.07648527142
Sum: 5.489628527e+10
Variance: 3.087202354e+10

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
125001 | 806 | 0.6%
125055 | 690 | 0.5%
584128 | 524 | 0.4%
125050 | 409 | 0.3%
334001 | 407 | 0.3%
342301 | 403 | 0.3%
342303 | 395 | 0.3%
464228 | 363 | 0.3%
261001 | 350 | 0.3%
495001 | 348 | 0.3%
Other values (9113) | 123588 | 96.1%
(Missing) | 372 | 0.3%

Minimum 5 values

Value | Count | Frequency (%)
110000 | 1 | < 0.1%
110003 | 1 | < 0.1%
110012 | 1 | < 0.1%
110016 | 1 | < 0.1%
110036 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
855456 | 1 | < 0.1%
855117 | 8 | < 0.1%
855116 | 7 | < 0.1%
855115 | 7 | < 0.1%
855114 | 9 | < 0.1%

Top-up Month
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1005.1 KiB

Value | Count | Frequency (%)
No Top-up Service | 106677 | 82.9%
> 48 Months | 8366 | 6.5%
36-48 Months | 3656 | 2.8%
24-30 Months | 3492 | 2.7%
30-36 Months | 3062 | 2.4%
18-24 Months | 2368 | 1.8%
12-18 Months | 1034 | 0.8%

Length

Max length: 17
Median length: 17
Mean length: 16.14585519
Min length: 12

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
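
The definition above translates directly into code. This is a minimal pure-Python sketch, not the implementation used by the report; the helper name `pearson_r` and the sample amounts are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

# Hypothetical financed and disbursed amounts (not rows from the dataset).
financed = [200000.0, 275000.0, 350000.0, 400000.0, 520000.0]
disbursed = [199000.0, 275000.0, 348500.0, 400000.0, 519000.0]

r = pearson_r(financed, disbursed)

# Shifting and rescaling one variable (with a positive scale) leaves r unchanged.
r_rescaled = pearson_r([x / 1000 + 5 for x in financed], disbursed)
print(r, r_rescaled)
```

Two nearly identical columns, as in this toy input, give r close to 1. That is the situation the High correlation warning flags for AmountFinance and DisbursalAmount.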

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
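
As a sketch of that recipe: rank both variables, then apply Pearson's formula to the ranks. The helper names are hypothetical, and ties are given average ranks, which is a common convention assumed here rather than taken from the report.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship (illustrative values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this input ρ reaches 1 while r stays below 1, which is the sense in which ρ catches nonlinear monotonic relationships.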

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
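
The pair-counting description corresponds to the variant usually called tau-a. Here is a minimal sketch of it with a hypothetical function name. Statistical libraries often compute a tie-adjusted variant (tau-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs

# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The double loop over all pairs is quadratic in the number of observations, so it suits small illustrations rather than a column with more than 100,000 rows.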

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
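
A minimal sketch of a bias-corrected Cramér's V, computed from a contingency table of counts. It follows the commonly cited form of Bergsma's correction and is an illustration under that assumption, not the report's own code; the function name and the tables are hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical 2x2 tables: no association, then perfect association.
independent = [[10, 10], [10, 10]]
perfect = [[50, 0], [0, 50]]
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

Clamping the corrected φ² at zero keeps the coefficient real-valued when the uncorrected statistic is smaller than its expected value under independence.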

Missing values

Sample

First rows

index | ID | Frequency | InstlmentMode | LoanStatus | PaymentMode | BranchID | Area | Tenure | AssetCost | AmountFinance | DisbursalAmount | EMI | DisbursalDate | MaturityDAte | AuthDate | AssetID | ManufacturerID | SupplierID | LTV | SEX | AGE | MonthlyIncome | City | State | ZiPCODE | Top-up Month
0 | 1 | Monthly | Arrear | Closed | PDC_E | 1 | NaN | 48 | 450000 | 275000.0 | 275000.0 | 24000.0 | 2012-02-10 | 2016-01-15 | 2012-02-10 | 4022465 | 1568 | 21946 | 61.11 | M | 49.0 | 35833.33 | RAISEN | MADHYA PRADESH | 464993.0 | > 48 Months
1 | 2 | Monthly | Advance | Closed | PDC | 333 | BHOPAL | 47 | 485000 | 350000.0 | 350000.0 | 10500.0 | 2012-03-31 | 2016-02-15 | 2012-03-31 | 4681175 | 1062 | 34802 | 70.00 | M | 23.0 | 666.67 | SEHORE | MADHYA PRADESH | 466001.0 | No Top-up Service
2 | 3 | Quatrly | Arrear | Active | Direct Debit | 1 | NaN | 68 | 690000 | 519728.0 | 519728.0 | 38300.0 | 2017-06-17 | 2023-02-10 | 2017-06-17 | 25328146 | 1060 | 127335 | 69.77 | M | 39.0 | 45257.00 | BHOPAL | MADHYA PRADESH | 462030.0 | 12-18 Months
3 | 7 | Monthly | Advance | Closed | Billed | 125 | GUNA | 48 | 480000 | 400000.0 | 400000.0 | 11600.0 | 2013-11-29 | 2017-11-10 | 2013-11-29 | 13021591 | 1060 | 25094 | 80.92 | M | 24.0 | 20833.33 | ASHOK NAGAR | MADHYA PRADESH | 473335.0 | > 48 Months
4 | 8 | Monthly | Arrear | Closed | Billed | 152 | BILASPUR | 44 | 619265 | 440000.0 | 440000.0 | 15000.0 | 2011-12-08 | 2015-07-05 | 2011-12-08 | 3291320 | 1046 | 21853 | 71.05 | M | 56.0 | 27313.67 | BILASPUR | CHATTISGARH | 495442.0 | 36-48 Months
5 | 9 | Monthly | Arrear | Closed | Billed | 5 | RAIPUR | 48 | 400000 | 280000.0 | 280000.0 | 53000.0 | 2011-12-19 | 2015-12-15 | 2011-12-19 | 3413012 | 1019 | 54689 | 70.00 | M | 40.0 | 42083.33 | RAIPUR | CHATTISGARH | 493885.0 | No Top-up Service
6 | 10 | Monthly | Arrear | Closed | PDC_E | 5 | RAIPUR | 48 | 716000 | 450000.0 | 450000.0 | 2000.0 | 2011-12-31 | 2015-12-05 | 2011-12-31 | 3553579 | 1019 | 54689 | 62.85 | M | 23.0 | 46221.00 | RAIPUR | CHATTISGARH | 493889.0 | No Top-up Service
7 | 11 | Monthly | Arrear | Closed | PDC_E | 5 | RAIPUR | 48 | 600000 | 360000.0 | 360000.0 | 11000.0 | 2012-02-09 | 2016-02-15 | 2012-02-09 | 4008234 | 1187 | 21232 | 60.00 | M | 41.0 | 46195.08 | RAIPUR | CHATTISGARH | 493114.0 | No Top-up Service
8 | 12 | Monthly | Arrear | Closed | PDC | 5 | RAIPUR | 46 | 539275 | 400000.0 | 400000.0 | 50000.0 | 2012-03-29 | 2016-01-15 | 2012-03-29 | 4603217 | 1046 | 24760 | 74.17 | M | 51.0 | 15000.00 | RAIPUR | CHATTISGARH | 493196.0 | No Top-up Service
9 | 13 | Monthly | Arrear | Closed | PDC | 5 | RAIPUR | 48 | 689275 | 490000.0 | 490000.0 | 10000.0 | 2012-03-30 | 2016-02-15 | 2012-03-30 | 4619836 | 1046 | 24760 | 71.09 | M | 33.0 | 31666.67 | RAIPUR | CHATTISGARH | 493344.0 | No Top-up Service

Last rows

index | ID | Frequency | InstlmentMode | LoanStatus | PaymentMode | BranchID | Area | Tenure | AssetCost | AmountFinance | DisbursalAmount | EMI | DisbursalDate | MaturityDAte | AuthDate | AssetID | ManufacturerID | SupplierID | LTV | SEX | AGE | MonthlyIncome | City | State | ZiPCODE | Top-up Month
128645 | 143385 | Half Yearly | Arrear | Active | Direct Debit | 414 | PALWAL | 24 | 450000 | 200000.0 | 200000.0 | 60248.0 | 2018-11-13 | 2020-11-05 | 2018-11-13 | 32009983 | 1568 | 105300 | 31.06 | M | 47.0 | 38083.33 | FARIDABAD | HARYANA | 121106.0 | No Top-up Service
128646 | 143386 | Quatrly | Arrear | Active | Direct Debit | 414 | PALWAL | 24 | 350000 | 200000.0 | 200000.0 | 29268.0 | 2018-11-13 | 2020-10-05 | 2018-11-13 | 32010196 | 1568 | 105300 | 57.14 | M | 24.0 | 42000.00 | FARIDABAD | HARYANA | 121105.0 | No Top-up Service
128647 | 143387 | Half Yearly | Arrear | Active | Direct Debit | 414 | PALWAL | 30 | 490000 | 250373.0 | 250373.0 | 75100.0 | 2018-11-14 | 2020-12-05 | 2018-11-15 | 32025240 | 1568 | 136126 | 35.69 | M | 43.0 | 38333.33 | GURGAON | HARYANA | 122103.0 | 12-18 Months
128648 | 143388 | Quatrly | Arrear | Active | Direct Debit | 414 | PALWAL | 24 | 350000 | 200000.0 | 200000.0 | 29217.0 | 2019-01-30 | 2020-12-05 | 2019-01-30 | 33047682 | 1568 | 139068 | 57.14 | M | 51.0 | 34150.00 | FARIDABAD | HARYANA | 121103.0 | 18-24 Months
128649 | 143389 | Quatrly | Arrear | Closed | Direct Debit | 416 | MANDLA | 12 | 650000 | 200000.0 | 200000.0 | 54370.0 | 2019-01-18 | 2019-12-05 | 2019-01-18 | 32884728 | 1568 | 85912 | 30.77 | M | 39.0 | 69083.33 | MANDLA | MADHYA PRADESH | 481661.0 | 18-24 Months
128650 | 143390 | Half Yearly | Arrear | Closed | Direct Debit | 424 | PANIPAT | 24 | 470000 | 265601.0 | 265601.0 | 76800.0 | 2018-09-21 | 2020-06-05 | 2018-09-22 | 31286914 | 1568 | 48879 | 40.17 | M | 25.0 | 65333.33 | SONIPAT | HARYANA | 131403.0 | 24-30 Months
128651 | 143391 | Half Yearly | Arrear | Closed | Direct Debit | 424 | PANIPAT | 24 | 460000 | 275630.0 | 275630.0 | 80100.0 | 2018-09-22 | 2020-06-05 | 2018-09-22 | 31295422 | 1568 | 48879 | 59.92 | M | 25.0 | 83333.33 | SONIPAT | HARYANA | 131403.0 | No Top-up Service
128652 | 143393 | Monthly | Arrear | Active | Direct Debit | 424 | PANIPAT | 23 | 545000 | 300733.0 | 300733.0 | 15277.0 | 2018-11-23 | 2020-11-05 | 2018-11-23 | 32145629 | 1568 | 44118 | 52.38 | M | 36.0 | 248500.00 | SONIPAT | HARYANA | 131024.0 | No Top-up Service
128653 | 143394 | Half Yearly | Arrear | Active | Direct Debit | 424 | PANIPAT | 35 | 350000 | 250962.0 | 250962.0 | 74341.0 | 2018-12-20 | 2021-06-05 | 2018-12-20 | 32509866 | 1568 | 48879 | 50.37 | M | 37.0 | 84500.00 | SONIPAT | HARYANA | 131103.0 | No Top-up Service
128654 | 143395 | Half Yearly | Arrear | Active | Direct Debit | 424 | PANIPAT | 24 | 370000 | 200428.0 | 200428.0 | 59200.0 | 2018-12-31 | 2020-11-05 | 2018-12-31 | 32663754 | 1568 | 48879 | 54.17 | M | 33.0 | 178166.67 | SONIPAT | HARYANA | 131402.0 | No Top-up Service